Evolution of fairness in the divide-a-lottery game

In this paper, we use an indirect evolutionary approach to show that fairness can evolve in the divide-a-lottery game, which is more general than the divide-a-dollar game. In the divide-a-lottery game, the size of the pie is uncertain. Two players sequentially bid for a share; each receives his bid if the allocation based on the bids turns out to be feasible, and otherwise neither gets anything. In this game, rational players over-compete for a higher share, resulting in a high probability of failure to agree, whereas fair players, who dislike disparity between shares, lower their bids, thereby reducing the failure probability and increasing the expected payoff. As a result, fairness strictly dominates rationality. This is the mechanism through which fairness evolves. However, this result is not robust against even slight uncertainty about the opponent's type. Surprisingly, we obtain a contrasting simulation result: for most parameter values, only rational players, who are strictly dominated by fair players under complete information, survive evolutionarily if players have even a slight chance of not knowing the opponent's type. Our simulation results for a local interaction model, in which players know the types of only their closer neighbors, capture both insights and demonstrate that moderate proportions of both types coexist evolutionarily over time, and that the average fitness of this polymorphic population is higher than that of a monomorphic population consisting only of fair types or only of rational types.

Although most economists and game theorists assume that material self-interest is the sole motivation of people, there is overwhelming counter-evidence gathered by psychologists and experimental economists. This evidence indicates that a substantial percentage of human beings are strongly motivated by other-regarding preferences including fairness, altruism etc. (For recent theoretical developments in evolution of prosocial cooperative behavior in various situations, e.g., in heterogeneous network structures, directional networks, and multilayer interactions, see McAvoy et al. 1 , Su et al. 2,3 ). Considering that the selfish behavior, by definition, maximizes the individual's utility or fitness and thus only homo economicus appears to be able to survive in the long run, it is rather puzzling that fair behavior survives in the long run in an evolutionary environment. The evidence of fairness is, however, well documented. For example, in the ultimatum game, a robust result across hundreds of experiments is that the vast majority of the offers are between 40 and 50 percent of the available surplus (see, for example, Güth et al. 4 , Camerer and Thaler 5 , Roth 6 , Camerer 7 ).
In this paper, we use an indirect evolutionary approach to show that fairness can evolve in the divide-a-lottery game, which is more general than the divide-a-dollar game. (In the indirect evolutionary approach, which was developed by Güth and Yaari 8 and Güth 9 , preferences are treated as endogenous in an evolutionary process, while actions are still determined by Nash equilibrium.) A divide-a-dollar game, also known as a Nash demand game 10 , is one of the most widely used bargaining games; it describes a procedure for splitting a dollar. Unlike the ultimatum game, which is sequential in the sense that one player proposes a share and then the other player decides whether to accept or reject it, a divide-a-dollar game is simultaneous. The game goes as follows. Player 1 and player 2 simultaneously bid the amounts x and y, respectively, to divide a dollar. If the bids form a feasible division, in the sense that x + y ≤ 1, each player gets his own bid; otherwise neither gets anything. In this game, unlike the ultimatum game, both players have a chance to bid.
In real situations, bargaining usually proceeds sequentially and both players have a chance to bid. Therefore, we consider a combination of the two bargaining games in which the two players offer bids sequentially. Unlike the ultimatum game, however, the value of the pie is uncertain. We call this a divide-a-lottery game. So, in a divide-a-lottery game, players bargain over a lottery by bidding sequentially. (Wang et al. 11 also introduce randomness in the size of the pie, but they consider a variant of the ultimatum game, not a variant of the divide-a-dollar game; hence, there is no sequential bidding in their model.) If two rational players bid sequentially, there is a first mover advantage: the first bidder bids 1/2 and the second bidder bids 1/4 when the value of the lottery is uniformly distributed on [0, 1]. This results in a high probability of disagreement (i.e., an infeasible allocation) due to severe bidding competition. However, a fair player who feels disutility from disparate bargaining shares makes the other player reduce his bid, increasing the probability of agreement. The upshot is that fairness lowers the bids and thereby increases the expected payoff. As a result, fairness strictly dominates rationality. This is the mechanism through which fairness evolves. However, this result is not robust against even slight uncertainty about the opponent's type. Surprisingly, we obtain a contrasting simulation result: for most parameter values, only rational players, who are strictly dominated by fair players under complete information, survive evolutionarily if players have even a slight chance of not knowing the opponent's type. Also, through simulations, we show that moderate proportions of both types coexist evolutionarily over time, and that the average fitness of this polymorphic population is higher than that of a monomorphic population consisting only of fair types or only of rational types.
Many authors have demonstrated that fairness evolves in the ultimatum game, not in the divide-a-dollar game nor in the divide-a-lottery game. Nowak et al. 12 highlighted the role of reputation in the evolution of fairness. If players interact repeatedly, accepting low offers as rational players do can induce the next proposer to make a low offer, so a fair strategy offering and demanding a high share can fare better than a rational strategy. Rand et al. 13 introduced the possibility of making mistakes to explain the evolution of fairness. Ichinose and Sayama 14 considered a game which they call not quite ultimatum game in a spatial interaction. Bethwaite and Tompkinson 15 considered players who are concerned about equity of the allocation similar to our model but did not investigate the evolutionary process of fairness.
Our result that fairness can survive evolutionarily in a more general bargaining situation than in the ultimatum game is a novel finding that does not rely on modeling artifacts in the sense that it is not due to repeated interaction (reputation effect) nor spatial structure of interaction.

Model
We consider a population consisting of a continuum of players of finite measure. Players are classified into two types: rational players (R) and fair players (F). At any time, they are pairwise matched and play a divide-a-lottery game. The value of a lottery, denoted by v, is uncertain. We assume that it is uniformly distributed on [0, 1]. The divide-a-lottery game goes as follows. First, one player (player 1) of the matched pair bids x and then the other player (player 2) bids y. After that, the value of v is realized. If it turns out that x + y ≤ v, they get x and y respectively, and if x + y > v, neither gets anything. We assume that each player of the matched pair is the first player or the second player with equal probability. This symmetric role assignment isolates our analysis from the effects of role assignment. For (reputation-based) role assignment in the dictator game, see Yang et al. 16 and Li et al. 17 .
We assume that when a pair is matched, the preference types of the players are known to each other. Since there are two possible types of players, it implies that four pairing combinations are feasible in a stage game.
Let the material payoff of player i be $\pi_i(x, y)$ for i = 1, 2. We assume that each player's material payoff is defined as
$$\pi_1(x, y) = x(1 - x - y), \qquad \pi_2(x, y) = y(1 - x - y).$$
This is the expected value of player i's share, since 1 − x − y = P(x + y ≤ v) is the probability of agreement, i.e., the probability that the allocation by bids is feasible. Although the payoff functions in the divide-a-lottery game are similar to those in the duopoly game, we do not believe that our results carry over straightforwardly to the behavior of firms. In fact, to the best of our knowledge, there is no empirical finding that most firms behave fairly in the duopoly game without maximizing their profits. For endogenous sequencing in the duopoly game, see Dowrick 18 , Boyer and Moreaux 19 , and Hamilton and Slutsky 20 .
When players choose their bids in a stage game, they maximize their subjective utility, not the material payoff. Let $U^F_i$ represent the subjective utility of fair player i. It is defined by
$$U^F_i(x, y) = \pi_i(x, y) - \alpha\, d(x, y),$$
where α ≥ 0 is a parameter representing how much this individual cares about fairness and d(x, y) measures the difference between the shares of the two players. (Some authors assume that the disutility is asymmetric, i.e., that the disutilities when x > y and when x < y are different. However, this preference is not really fair.) The larger α is, the fairer the player is. (Conceptually, it is possible that α = ∞, but we restrict our attention to α ∈ [0, 1] to avoid the case in which a fair player's payoff is −∞.) As α → 0, the player is almost rational. Throughout the article, we assume the simple functional form $d(x, y) = (x - y)^2$. Let $U^R_i$ be the subjective utility of a rational player. We assume that the subjective utility of a rational player is the same as his material payoff:
$$U^R_i(x, y) = \pi_i(x, y).$$
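The expected-share interpretation of the material payoff can be checked numerically. The following is a minimal sketch, assuming that π_i equals the bid times the agreement probability 1 − x − y; the function names, the clipping of the probability at zero for x + y > 1, and the Monte Carlo settings are illustrative assumptions, not part of the paper.

```python
import random


def material_payoffs(x, y):
    """Closed-form expected shares (pi_1, pi_2) when v ~ Uniform[0, 1].

    Assumption: the agreement probability is clipped at 0 when x + y > 1.
    """
    p_agree = max(0.0, 1.0 - x - y)  # P(x + y <= v)
    return x * p_agree, y * p_agree


def simulated_payoffs(x, y, draws=200_000, seed=0):
    """Monte Carlo estimate: each player receives his bid only if x + y <= v."""
    rng = random.Random(seed)
    agreements = sum(1 for _ in range(draws) if x + y <= rng.random())
    return x * agreements / draws, y * agreements / draws


if __name__ == "__main__":
    print(material_payoffs(0.5, 0.25))
    print(simulated_payoffs(0.5, 0.25))
```

With bids of 1/2 and 1/4, the closed form gives an agreement probability of 1/4, and the simulated averages should be close to the closed-form values.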

Results
In this section, we analyze two cases: the case in which each player is informed of the type of the opponent he is facing (complete information case) and the case in which each player is not informed of the opponent's type (incomplete information case).
Complete information about the opponent's type. We consider two symmetric matching cases, rational player vs. rational player and fair player vs. fair player, and one asymmetric matching case, rational player vs. fair player.
Rational player vs. rational player. Since a stage game is sequential, we use backward induction to obtain the subgame perfect equilibrium. If rational player 1 plays against rational player 2, given the bid of player 1, x, player 2 seeks to maximize his material payoff by choosing y:
$$\max_{y}\; \pi_2(x, y) = y(1 - x - y),$$
so we obtain player 2's best response as a function of x:
$$y_R(x) = \frac{1 - x}{2}.$$
Taking account of this response, player 1 chooses x to maximize
$$\pi_1(x, y_R(x)) = \frac{x(1 - x)}{2}.$$
Therefore, equilibrium bids are $x^* = 1/2$ and $y^* = y_R(x^*) = 1/4$. Let $\pi_{RR}$ be the material expected payoff of a rational player playing against another rational player. The material payoffs of rational player 1 and player 2 are $\pi_1 = 1/8$ and $\pi_2 = 1/16$ respectively. This shows the first mover advantage. The first mover can choose a higher bid than the second mover, who is passive. By increasing his bid before player 2, he can enjoy a strategic advantage. Since a player can be player 1 or player 2 with equal probability, the expected value of the material payoff is
$$\pi_{RR} = \frac{1}{2}\left(\frac{1}{8} + \frac{1}{16}\right) = \frac{3}{32}.$$
Fair player vs. fair player. If a fair player plays against another fair player, player 2 chooses his bid to maximize his subjective utility, taking his opponent's bid x as given:
$$\max_{y}\; U^F_2(x, y) = y(1 - x - y) - \alpha (x - y)^2. \tag{8}$$
From the first-order condition for maximizing (8), we obtain fair player 2's best response function as
$$y_F(x) = \frac{1 + (2\alpha - 1)x}{2(1 + \alpha)}. \tag{9}$$
Taking this response of player 2 into account, player 1 chooses x to maximize
$$U^F_1(x, y_F(x)) = x\bigl(1 - x - y_F(x)\bigr) - \alpha\bigl(x - y_F(x)\bigr)^2.$$
Therefore, we obtain equilibrium bids as
$$x^*(\alpha) = \frac{2\alpha^2 + 6\alpha + 1}{8\alpha^2 + 19\alpha + 2}, \qquad y^*(\alpha) = y_F(x^*(\alpha)) = \frac{4\alpha^2 + 14\alpha + 1}{2(8\alpha^2 + 19\alpha + 2)}.$$
We have $\partial x^*(\alpha)/\partial\alpha < 0$, $x^*(0) = 1/2$, $x^*(1) = 9/29 < 1/3$ and $y^*(1) = 19/58 > 9/29$. As α gets larger, i.e., as players get fairer, player 1 bids less, which reduces the first mover advantage, and if α = 1, the first mover advantage disappears completely, because $x^*(1) < y^*(1)$. (It is easy to check that $y^*(\alpha)$ is not monotonic with respect to α.)
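The backward-induction steps above can be sketched in exact arithmetic. This is a minimal illustration, assuming the utilities $U^F_i = \pi_i - \alpha(x - y)^2$ and $U^R_i = \pi_i$; the function names are hypothetical, and the closed form in `first_mover_bid`, which covers mixed pairings by giving each player his own fairness weight (0 for a rational player), is derived here from the first-order conditions rather than quoted from the paper.

```python
from fractions import Fraction


def second_mover_bid(x, a2):
    """Best response of player 2 with fairness weight a2 (a2 = 0 means rational)."""
    return (1 + (2 * a2 - 1) * x) / (2 * (1 + a2))


def first_mover_bid(a1, a2):
    """Player 1's bid from his first-order condition, anticipating second_mover_bid.

    Assumption: derived by maximizing x(1 - x - y(x)) - a1 * (x - y(x))**2.
    """
    return ((1 + a2) * (1 + 2 * a2) + 3 * a1) / (2 * (1 + a2) * (4 * a2 + 1) + 9 * a1)


def equilibrium(a1, a2):
    """Return (x*, y*, pi_1, pi_2) for fairness weights a1 (player 1) and a2 (player 2)."""
    a1, a2 = Fraction(a1), Fraction(a2)
    x = first_mover_bid(a1, a2)
    y = second_mover_bid(x, a2)
    agree = 1 - x - y  # probability that x + y <= v
    return x, y, x * agree, y * agree


if __name__ == "__main__":
    cases = [("R vs R", 0, 0), ("F vs F, alpha = 1", 1, 1), ("R vs F, alpha = 1", 0, 1)]
    for label, a1, a2 in cases:
        print(label, equilibrium(a1, a2))
```

Setting both weights to 0 recovers the rational benchmark, and setting both to 1 recovers the fair-versus-fair bids stated in the text, so the sketch can serve as a quick consistency check on the reconstructed formulas.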
If the fitness of a player is determined by his material payoff, we can see from Table 1 that fairness is the dominant strategy for any α ∈ [0, 1] . For the numerical proof, Fig. 1A shows that π F > π RR and π FF > π R for any α ∈ (0, 1] . So, if strategies evolve in proportion to the material payoffs, only fairness can survive evolutionarily in the long run. Here, as a dynamic solution concept, we are using a long-run asymptotic (local) attractor that can be roughly defined by the population distribution to which an initial distribution converges over time whenever it starts from the neighborhood.
Proposition 1 Only fair players can survive evolutionarily if a randomly matched pair in a population plays a divide-a-lottery game and the players know each other's type.
The analytic proof is omitted, because it is well known that a strict Nash equilibrium is an evolutionarily stable strategy (ESS) by Maynard Smith and Price 21 which is a long-run asymptotic attractor. In this game, (F, F) is a strict Nash equilibrium.
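The dominance claim can be checked numerically from the closed-form payoffs of Table 1. The sketch below is illustrative only: the function names and the dictionary layout (own type, opponent's type) are assumptions, and the denominator used for π_FF, 8(8α² + 19α + 2)², is an assumption derived from the fair-versus-fair equilibrium bids.

```python
def payoff_matrix(alpha):
    """Expected material payoffs keyed by (own type, opponent's type)."""
    a = alpha
    pi_rr = 3 / 32
    pi_r = (612 * a**4 + 924 * a**3 + 441 * a**2 + 86 * a + 6) / (
        16 * (a + 1) * (4 * a + 1) * (9 * a + 2) ** 2
    )
    pi_f = (1224 * a**5 + 3492 * a**4 + 3106 * a**3 + 1169 * a**2 + 196 * a + 12) / (
        32 * (a + 1) ** 2 * (4 * a + 1) * (9 * a + 2) ** 2
    )
    # Assumption: denominator reconstructed from the F-vs-F equilibrium bids.
    pi_ff = (8 * a**2 + 26 * a + 3) * (8 * a**2 + 12 * a + 1) / (
        8 * (8 * a**2 + 19 * a + 2) ** 2
    )
    return {("R", "R"): pi_rr, ("R", "F"): pi_r, ("F", "R"): pi_f, ("F", "F"): pi_ff}


def fairness_strictly_dominates(alpha):
    """True if F earns strictly more than R against either opponent type."""
    m = payoff_matrix(alpha)
    return m[("F", "R")] > m[("R", "R")] and m[("F", "F")] > m[("R", "F")]


if __name__ == "__main__":
    grid = [k / 100 for k in range(1, 101)]
    print(all(fairness_strictly_dominates(a) for a in grid))
```

The second inequality, π_FF > π_R, is also the condition for (F, F) to be a strict Nash equilibrium of the symmetric game, which is what the ESS argument relies on.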
At this moment, it is worthwhile to compare this game with the prisoners' dilemma (PD) game. In a PD game, cooperation (C) is strictly dominated by defection (D), but (C, C) yields higher fitness than (D, D). In other words, (C, C) is the collectively rational outcome (socially efficient outcome), whereas (D, D) is the individually rational outcome (privately optimal outcome). The discrepancy is where social dilemma comes from. In our divide-a-lottery game, fairness strictly dominates rational behavior, but unlike the PD game, the individually rational outcome (F, F) yields higher fitness than (R, R). This is the main difference from the PD game. Also, it is interesting to note that the collectively rational outcome in this game is (F, R) and (R, F), not (F, F), for most parameter values except for very small values of α ∈ (0, 0.04) (Fig. 1B). This implies that for most values of α , a polymorphic population is socially better than a monomorphic population consisting only of fair players in terms of the population average fitness.
Since the complete information assumption that drives this result is too strong to properly capture the real world phenomenon, we relax the assumption and consider the incomplete information case in the next section.
Incomplete information about the opponent's type. In this section, we assume that players cannot tell the type of the opponent but only know the proportion of each type. Let p be the proportion of fair players in the population. A rational player 1 chooses $x_R$ to maximize his expected material payoff,
$$p\, x_R\bigl(1 - x_R - y_F(x_R)\bigr) + (1 - p)\, x_R\bigl(1 - x_R - y_R(x_R)\bigr).$$
The first order condition leads to
$$x_R = \frac{\alpha p + \alpha + 1}{6\alpha p + 2\alpha + 2}. \tag{15}$$
Substituting (15) and (17) into (6) and (9) gives the second movers' bids. Let $\pi^{IR}_i$ and $\pi^{IF}_i$ be the equilibrium material payoffs of a rational and a fair player i, respectively, where i = 1, 2, in the case of incomplete information. Then, the expected value of the material payoff of each type is
$$\pi^{IR} = \frac{1}{2}\left(\pi^{IR}_1 + \pi^{IR}_2\right), \qquad \pi^{IF} = \frac{1}{2}\left(\pi^{IF}_1 + \pi^{IF}_2\right).$$
Let us consider replicator dynamics for the proportion $p_t$ of fair players, in which the population average payoff is $\bar{\pi}^I = (1 - p_t)\pi^{IR} + p_t\pi^{IF}$. Then, we can find the limiting distribution of R and F. Figure 2A shows our simulation result that only the monomorphic population consisting of rational players emerges as a result of evolution for most parameter values (black region), while a polymorphic population can emerge for high values of α (degree of fairness) and $p_0$ (initial proportion of fair players) (yellow region). Figure 2B shows the average of material payoffs in the limiting states for various combinations of $(p_0, \alpha)$. It implies that the polymorphic population consisting of a mixture of rational players and fair players is better than the monomorphic population consisting solely of rational players in terms of the population average fitness. This implies that the population distribution is very unlikely to converge to the monomorphic population distribution that yields the highest population average fitness when players interact with each other globally with equal probabilities.
This result is quite puzzling. In this game, fairness strictly dominates rationality in the case of complete information, as shown in Table 1. It means that it is better for a player to play fairly, regardless of the opponent's type. This seems to imply that a player does not need to know the opponent's type, because he will get a better payoff when he plays fairly than when he plays rationally. Then, how can rational players still survive evolutionarily if players cannot be sure of the opponent's type? Specifically, when $p_0 \approx 0$, how can it be possible that $\pi^{IR} > \pi^{IF}$ in the case of incomplete information, although $p_0 \approx 0$ means that the population consists almost entirely of rational players, so that Table 1 seems to suggest that only fair players can survive because $\pi_F > 3/32$? The answer to this puzzle can be found in the difference between $\pi_F = \frac{1}{2}(\pi^{FR}_1 + \pi^{RF}_2)$ given in (13) and $\pi^{IF} = \frac{1}{2}(\pi^{IF}_1 + \pi^{IF}_2)$ given in (23) when $p_0 \approx 0$. Although they look the same, the values of $x_R$ in the two formulas are different, depending on what the opponent is. In the former (the complete information case), it is computed from the assumption that the second mover is fair. (We used the notation $x^*(\alpha)$ instead of $x_R$ in the analysis of the complete information case to distinguish them.) In the latter (the incomplete information case), however, the rational player chooses $x_R$ expecting that the second mover is highly likely to be rational when $p_0 \approx 0$. (In other words, the true opponent type and the expected opponent type can be different in the case of incomplete information, whereas this is not possible in the case of complete information.) So, $x_R$ in this case is larger and thus the pair of bids is more likely to be infeasible. Hence, $\pi^{RF}_2$ is lower, and so is the fitness of a fair player in the case of incomplete information. The overall intuition for the case of incomplete information goes as follows.
If players have complete information about the opponent's type, a rational player 1 bids very high against a rational opponent, so rational player 2 is severely exploited, while he bids lower against a fair opponent, because he knows that fair player 2 will bid so high that a high bid of his own would very likely lead to a failure in bargaining. However, if players have incomplete information about the opponent's type, a rational player who is unsure of the opponent's type must bid lower than when he knows he faces a rational opponent, and so a rational player 2 is not exploited as much. This is one of the main reasons why rational players fare better under incomplete information. Similarly, a fair player who is unsure of the opponent's type can bid higher if there is a high probability that the opponent is a rational type. So, incomplete information can make a rational player play more like a fair player and a fair player play more like a rational player. (If a player is the second mover, his decision does not depend on the opponent's type, so incomplete information about the opponent's type does not change the choice he would make under complete information.) Under complete information, a rational player earns a low material payoff because he is significantly exploited when he is the second mover. On the other hand, under incomplete information, he is not exploited as much because the opponent is not sure whether he is rational or not.

Table 1. Payoff matrix (material payoffs of the row type against the column type).
$$\begin{array}{c|cc} & R & F \\ \hline R & \pi_{RR} & \pi_R \\ F & \pi_F & \pi_{FF} \end{array}$$
$$\pi_{RR} = \frac{3}{32}, \qquad \pi_R = \frac{612\alpha^4 + 924\alpha^3 + 441\alpha^2 + 86\alpha + 6}{16(\alpha + 1)(4\alpha + 1)(9\alpha + 2)^2},$$
$$\pi_F = \frac{1224\alpha^5 + 3492\alpha^4 + 3106\alpha^3 + 1169\alpha^2 + 196\alpha + 12}{32(\alpha + 1)^2(4\alpha + 1)(9\alpha + 2)^2}, \qquad \pi_{FF} = \frac{(8\alpha^2 + 26\alpha + 3)(8\alpha^2 + 12\alpha + 1)}{8(8\alpha^2 + 19\alpha + 2)^2}.$$
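The first-order condition (15) can be sanity-checked against its two limiting cases. The sketch below is illustrative: the function names are hypothetical, and the expected payoff it maximizes assumes that the second mover responds with $y_F(x) = (1 + (2\alpha - 1)x)/(2(1 + \alpha))$ if fair and $y_R(x) = (1 - x)/2$ if rational.

```python
from fractions import Fraction


def rational_first_bid(p, alpha):
    """Bid of a rational first mover who faces a fair opponent with probability p, eq. (15)."""
    p, alpha = Fraction(p), Fraction(alpha)
    return (alpha * p + alpha + 1) / (6 * alpha * p + 2 * alpha + 2)


def expected_payoff(x, p, alpha):
    """Expected material payoff of a rational first mover bidding x (assumed objective)."""
    x, p, alpha = Fraction(x), Fraction(p), Fraction(alpha)
    y_fair = (1 + (2 * alpha - 1) * x) / (2 * (1 + alpha))  # fair second mover
    y_rational = (1 - x) / 2                                 # rational second mover
    return x * (p * (1 - x - y_fair) + (1 - p) * (1 - x - y_rational))


if __name__ == "__main__":
    for p in (0, Fraction(1, 2), 1):
        print(p, rational_first_bid(p, 1))
```

At p = 0 the bid equals the rational benchmark 1/2, and at p = 1 it equals the bid a rational first mover would make against a known fair opponent, so the bid falls as fair opponents become more likely.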
However, if the proportion of fair players is very high, it becomes an almost complete information game, and the advantage of fairness in the case of complete information is almost balanced with the advantage of rationality in the case of incomplete information, so both of the two types can survive and evolve over time.
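The resolution of the puzzle can be illustrated numerically. The following is a rough sketch under several assumptions that are not taken from the paper: the fair first mover's bid is derived here from the first-order condition of his expected subjective utility, second movers best-respond to the bid they observe, and the replicator update is taken to be the discrete form $p_{t+1} = p_t\,\pi^{IF}/\bar{\pi}^I$. All function names are hypothetical.

```python
def best_response(x, a):
    """Second mover's bid given observed x and own fairness weight a (0 = rational)."""
    return (1 + (2 * a - 1) * x) / (2 * (1 + a))


def first_bids(p, alpha):
    """First-mover bids (x_R, x_F) when the opponent is fair with probability p."""
    x_r = (alpha * p + alpha + 1) / (6 * alpha * p + 2 * alpha + 2)  # eq. (15)
    # Assumption: x_F from the first-order condition of expected subjective utility.
    k = (1 + alpha) ** 2
    num = p * (2 * (1 + alpha) * (1 + 2 * alpha) + 6 * alpha) + (1 - p) * k * (2 + 6 * alpha)
    den = p * (4 * (1 + alpha) * (4 * alpha + 1) + 18 * alpha) + (1 - p) * k * (4 + 18 * alpha)
    return x_r, num / den


def type_payoffs(p, alpha):
    """Expected material payoffs (pi_IR, pi_IF), averaging over both roles."""
    x_r, x_f = first_bids(p, alpha)

    def as_first(x):
        # Opponent is fair with probability p, rational otherwise.
        fair_part = 1 - x - best_response(x, alpha)
        rational_part = 1 - x - best_response(x, 0.0)
        return x * (p * fair_part + (1 - p) * rational_part)

    def as_second(a):
        # The first mover is fair (bids x_f) with probability p, else rational (bids x_r).
        total = 0.0
        for prob, x in ((p, x_f), (1 - p, x_r)):
            y = best_response(x, a)
            total += prob * y * (1 - x - y)
        return total

    pi_ir = 0.5 * (as_first(x_r) + as_second(0.0))
    pi_if = 0.5 * (as_first(x_f) + as_second(alpha))
    return pi_ir, pi_if


def replicator_step(p, alpha):
    """One discrete replicator update of the share of fair players (assumed form)."""
    pi_ir, pi_if = type_payoffs(p, alpha)
    mean = (1 - p) * pi_ir + p * pi_if
    return p * pi_if / mean


if __name__ == "__main__":
    print(type_payoffs(0.0, 0.5))
```

At p = 0 and α = 1/2, the rational type earns 3/32 while the fair type earns less, because a fair second mover now faces the aggressive bid 1/2 that a rational first mover aims at a presumed rational opponent. This is consistent with the explanation given in the text.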
Local interaction on a network. In this section, we consider a simple network structure on which players interact locally to play a divide-a-lottery game. For recent studies on local interaction in other situations such as the prisoners' dilemma game with social diversity or the snowdrift game, see Perc and Szolnoki 22 and Hauert and Doebeli 23 . Initially, there are n (= 100) players on a circle. The type of each player (R or F) is assigned randomly according to the pre-assigned ratio of fair players ($p_0$). Here, we introduce two parameters, the interaction radius ($r_{inter}$) and the information radius ($r_{infor}$). Each player interacts only with neighbors within the given $r_{inter}$. In line with the complete and incomplete information cases investigated in the previous subsections, a player knows the types of her neighbors within the given $r_{infor}$ and does not know the types of players beyond $r_{infor}$. Then, players observe their own fitness and that of their neighbors within the interaction radius after they play the divide-a-lottery game with the neighbors. Finally, each player changes her type by imitating the type of her neighbors when the average fitness of her neighbors of the other type is greater than the fitness of her own type. The simulation continues until the dynamics become stable. We record the fraction of fair players at the final step of the simulation ($p_{final}$) while varying two parameters, the initial fraction of fair players ($p_0$) and the degree of fairness concern α, averaging over $10^3$ simulation runs (see Fig. 3).
For given $r_{inter} = 10$ and $r_{infor} = 1$, we find mainly two phases of final states: (1) at higher α and higher $p_0$, only fair players survive (white region), and (2) rational players are predominant at higher α and lower $p_0$. This means that the emergence of fairness is determined by the given values of α and $p_0$. Note that if α is too high, it may not be good for a fair player because his bid becomes lower (Fig. 3A). For different combinations of $r_{inter}$ and $r_{infor}$, the results are qualitatively similar (see Fig. S1). At lower values of α, both types of players can coexist, as depicted in yellow in Fig. 3A. This is also confirmed in Fig. 3B, which illustrates by simulation how an initial distribution reaches a stationary spatial distribution over time. Note that some nodes oscillate unstably at first, and then become stable and maintain their types over time. Also, Fig. 3C shows that a polymorphic population consisting of both rational players and fair players is better than the monomorphic population consisting only of fair players in terms of population average fitness. This is mainly because $\pi_F + \pi_R > 2\pi_{FF}$ for most parameter values of α. This figure implies that the evolutionary process under local interaction favors a final population distribution that yields high population average fitness.
Before closing this section, we highlight the intuition for why fair players can survive evolutionarily. If two rational players play the game, there is a first mover advantage. Player 1 preempts an advantageous position by making a high bid, which makes player 2 make a low bid. However, if player 2 is fair, player 1 cannot bid high because he knows that the fair opponent will not reduce his bid very much due to his concern for fairness. This makes player 1 reduce his bid. So, as α gets larger, i.e., as player 2 gets fairer, player 1 reduces his bid more so that player 2 increases his bid, and thus player 1's payoff gets smaller and player 2's payoff gets larger, until they bid the same when α = 1/2. If α exceeds 1/2, player 2 begins to reduce his bid, although he still bids more than player 1. So, player 1's payoff begins increasing as α gets larger. Figure 4 shows this intuition.
If both players are fair, the situation is similar. In this case, the best response curves are both upward sloping. Since player 1 takes the best response function of player 2 as given, he will choose the optimal point along the best response curve of player 2, which is lower than x = 1/3, the intersection of the two best response curves. Since the best response curve of player 1 is upward sloping, low x means low y. Since both x and y are reduced, both fair players get higher payoffs than rational players. It is illustrated in Fig. 5.

$x_R(y)$ and $y_R(x)$ are reaction curves when both players are rational, and the red line denoted by $y_F(x)$ is the reaction curve when the player is fair. Blue curves are the two players' indifference curves that yield the same utility to each player. The cap-shaped curve is player 1's indifference curve. His utility increases as it moves downward towards the x-axis, while player 2's utility increases as it moves inward towards the y-axis. If the second player is rational, his reaction curve is $y_R(x)$, so player 1 chooses the point that gives the maximum utility on $y_R(x)$. It is the tangent point of $y_R(x)$ and the indifference curve, (1/2, 1/4) if α = 0. If the second player is fair, the tangent point of $y_F(x)$ and the indifference curve is (3/10, 13/40) if α = 1.

Figure 5. Equilibrium bids when both players are fair. If player 1 is fair, his indifference curve has zero slope on $x_F(y)$, not on $x_R(y)$, because $x_F(y)$ is the optimal point for him given y. This figure shows the equilibrium bids (9/29, 19/58), which is the tangent point of player 1's indifference curve and player 2's reaction curve $y_F(x)$, when both are fair, if α = 1.

Discussion
In this paper, we demonstrated that fair players can survive evolutionarily in a divide-a-lottery game. Moreover, we showed that rational players can also survive in the environment in which the bargaining players do not know each other's type until they play the bargaining game with the opponent, depending on the initial population distribution. Considering the reality that players often compete with their local neighbor whose type is not known (until they interact) and for the pie the value of which is uncertain, we believe that this result gives a sensible prediction in the real world.

Data availability
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.